Skip to content

subsys: usb: host: support for usb host cdc-ecm class#100289

Open
D-Veda wants to merge 3 commits intozephyrproject-rtos:mainfrom
nxp-upstream:usbh_cdc_ecm_support
Open

subsys: usb: host: support for usb host cdc-ecm class#100289
D-Veda wants to merge 3 commits intozephyrproject-rtos:mainfrom
nxp-upstream:usbh_cdc_ecm_support

Conversation

@D-Veda
Copy link
Copy Markdown

@D-Veda D-Veda commented Dec 1, 2025

Dependencies:

This implementation contains all commits from #94590 and in addition:

  • Add CDC-ECM class driver usbh_cdc_ecm.c
  • Add ethernet network interface support for USB host

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Dec 1, 2025

Hello @D-Veda, and thank you very much for your first pull request to the Zephyr project!
Our Continuous Integration pipeline will execute a series of checks on your Pull Request commit messages and code, and you are expected to address any failures by updating the PR. Please take a look at our commit message guidelines to find out how to format your commit messages, and at our contribution workflow to understand how to update your Pull Request. If you haven't already, please make sure to review the project's Contributor Expectations and update (by amending and force-pushing the commits) your pull request if necessary.
If you are stuck or need help please join us on Discord and ask your question there. Additionally, you can escalate the review when applicable. 😊

@D-Veda D-Veda changed the title Usbh cdc ecm support subsys: usb: host: Add USB Host CDC ECM Class Support for Ethernet subsys: usb: host: support for usb host cdc-ecm class Dec 1, 2025
@jukkar jukkar removed their request for review December 1, 2025 10:15
@@ -0,0 +1,1157 @@
/*
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a ethernet driver, so it belongs in drivers/ethernet/. I know there are already others there, but these should probably be moved too. I don't like ethernet drivers and its api be scattered all over the tree. (should be limited to the ethernet and wifi drivers and the net subsystem)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is blocking

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like a bit of discussion would help to decide:

  • Where USB classes (host/device) that implement a driver API are implemented. This would imply most USB classes move, or it would be an exception just for CDC-ECM.

  • Whether source in drivers/ are allowed to call functions from subsystem/. I've not seen anything in the doc but suspect this could interest several parties to review which model to use.

How about pursuing with the implementation details in the meantime that the location of tes code is discussed?

Thank you for the review!

Copy link
Copy Markdown
Member

@maass-hamburg maass-hamburg Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drivers are allowed to use subsystems.

For example there is the random subsystem subsys/random/, it implements /include/zephyr/random/random.h, which includes sys_rand_get(). Btw the random subsystem uses the entropy_driver_api, if that is selected to be the source of randomness.

For the ethernet drivers, we have a common dt prop zephyr,random-mac-address if that is set, the ethernet mac address is randomly generated with sys_rand_get() during the init of the ethernet driver.

Other example are sd cards in zephyr. (that might be similar enough to usb, as it also has a sd subsystem above the sdhc drivers and api)

The sd card or disk (dt compatible zephyr,sdmmc-disk, drivers/disk/sdmmc_subsys.c) implements the disk api (zephyr/drivers/disk.h) and uses the SD subsystem (subsys/sd) and that uses the sdhc drivers (include/zephyr/drivers/sdhc.h).

We also have drivers for a SDIO wifi card in tree, which uses the sd subsystem it is in
drivers/wifi/infineon/airoc_whd_hal_sdio.c and not in subsys/sd.

Even bigger: We have a crypto device api and one of the drivers is zephyr/drivers/crypto/crypto_mbedtls_shim.c and it is using mbedTLS, which even is a separate module.

Drivers can use anything what they need.

Another thing is control and ensuring quality. There is a reason we have Driver Areas in the maintainers.yml and that in addition to the vendor specific areas for their drivers.
For this PR I wasn't added because I'm the ethernet maintainer, no it was just because I'm also a collaborator for DHCP in the networking subsystem and it's sample got changed.

How about pursuing with the implementation details in the meantime that the location of tes code is discussed?

Sure

Copy link
Copy Markdown
Contributor

@josuah josuah Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your overview of driver <-> subsystem dependencies. I think I heard this "subsystem/driver" split about a particular driver, not as a general guideline.

For this PR I wasn't added because I'm the ethernet maintainer, no it was just because I'm also a collaborator for DHCP in the networking subsystem and it's sample got changed.

Regardless the solution, there would need to make sure that MAINTAINERS.yml pings correctly everyone involved.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jfischer-no can you please share your thoughts on the location of this driver.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the USB host class drivers that use USBH class driver API and implement a USB class specification or other very well known standard must be placed in subsys/usb/host/class. Similar to what we do with USB device support.

Also, considering that there is not even an Ethernet controller driver API, the complaints and claims here are absolutely unfounded and inappropriate.

The commit usb: host: class: cdc_ecm: Add CDC-ECM host class driver implementation places correctly a new USB host CDC ECM driver in subsys/usb/host/class.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the USB host class drivers are drivers and as such should not be in subsys/* but instead in drivers/*

@D-Veda D-Veda marked this pull request as draft December 1, 2025 10:42
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
@@ -0,0 +1,1157 @@
/*
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the USB host class drivers are drivers and as such should not be in subsys/* but instead in drivers/*

Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment on lines +1632 to +1666
max_segment_size = ctx->max_segment_size;
k_mutex_unlock(&ctx->mutex);

total_len = net_buf_frags_len(buf);
if (total_len > max_segment_size || total_len > CONFIG_USBH_CDC_ECM_DATA_BUF_POOL_SIZE) {
return -EMSGSIZE;
}

tx_buf = net_buf_alloc(&usbh_cdc_ecm_data_pool, K_NO_WAIT);
if (tx_buf == NULL) {
LOG_WRN("failed to allocate data buffer for transmitting");
return -ENOMEM;
}

if (net_buf_linearize(tx_buf->data, net_buf_tailroom(tx_buf), buf, 0, total_len) !=
total_len) {
LOG_ERR("data linearization failed for transmitting");
ret = -EIO;
goto cleanup;
}

net_buf_add(tx_buf, total_len);

ret = usbh_cdc_ecm_start_data_out_xfer(ctx, tx_buf);
if (ret != 0) {
goto cleanup;
}

return 0;

cleanup:
if (tx_buf != NULL) {
net_buf_unref(tx_buf);
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leave it for now.

But for the non-cachable memory, these host controller drivers should instead add cache management functions in the future

@carlescufi carlescufi moved this from Todo to In Progress in Architecture Review Apr 20, 2026
Comment thread include/zephyr/usb/class/usb_cdc.h Outdated
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Copy link
Copy Markdown
Contributor

@josuah josuah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for this very incomplete review as I am short on time today.

#99097 (comment) does not seem to be interested in modifying his PR to make it closer to this one. Would you be interested in trying to rebase your PR on top of his content?

If finding modifications needed to #99097, they could then be submitted as review comments to his PR.

Comment thread subsys/usb/host/class/usbh_cdc_ecm.c
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Copy link
Copy Markdown
Contributor

@josuah josuah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some various feedback around queued_xfers.list, which can hopefully be avoided.

Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
@D-Veda D-Veda force-pushed the usbh_cdc_ecm_support branch from e67d72b to 2561c10 Compare May 4, 2026 15:33
@D-Veda D-Veda requested review from josuah and maass-hamburg May 4, 2026 15:37
@D-Veda D-Veda force-pushed the usbh_cdc_ecm_support branch from 2561c10 to 01dfed3 Compare May 4, 2026 17:02
@D-Veda
Copy link
Copy Markdown
Author

D-Veda commented May 4, 2026

The latest CDC-ECM implementation has improved the following issues

  1. Reduce the use of mutexes, hold locks only in necessary competitive environments, and use atomic flags in USB callbacks to avoid data contention
  2. Refine and streamline the descriptor matching mechanism
  3. Simplify the MAC address query mechanism by first querying according to the default LANGID, and then obtain the LANGID in turn if it fails
  4. Improve the two-way concurrency mechanism, interrupt in multiplexes xfer and associated buf, bulk IN multiplexes xfer, only when data needs to be forwarded to the Ethernet protocol stack to assign new buf associated with xfer, bulk out callback multiplexes xfer to trigger ZLP transmission
  5. All Ethernet interface operations implement error rollback operations (e.g., set packet filter / multicast filter fails, rolling back the previous value)
  6. Streamline the xfer recycling mechanism, only when the dequeue triggers the callback, the xfer will be released, and the rest of the cases are just paused xfer, canceling the dlist tracking mechanism
  7. Streamlined implementation of multicast filters/statistics, updated CDC-ECM statistics-related macros

@D-Veda D-Veda force-pushed the usbh_cdc_ecm_support branch from 01dfed3 to 5822aa9 Compare May 4, 2026 17:23
Comment thread subsys/usb/host/class/Kconfig.cdc_ecm
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment on lines +1031 to +1061
static void carrier_work_handler(struct k_work *work)
{
struct k_work_delayable *dwork = k_work_delayable_from_work(work);
struct carrier_work_ctx *const carrier_work =
CONTAINER_OF(dwork, struct carrier_work_ctx, dwork);
struct cdc_ecm_host_data *const host_data =
CONTAINER_OF(carrier_work, struct cdc_ecm_host_data, carrier_work);
int state;

state = atomic_set(&carrier_work->pending_state, CDC_ECM_DEVICE_WORK_CARRIER_STATE_NONE);

if (!atomic_test_bit(&host_data->flags, CDC_ECM_DEVICE_FLAG_CONNECTED) ||
!atomic_test_bit(&host_data->flags, CDC_ECM_DEVICE_FLAG_FORWARDING)) {
return;
}

switch (state) {
case CDC_ECM_DEVICE_WORK_CARRIER_STATE_ON:
LOG_DBG("Carrier ON (deferred)");
net_eth_carrier_on(host_data->eth_iface);
break;

case CDC_ECM_DEVICE_WORK_CARRIER_STATE_OFF:
LOG_DBG("Carrier OFF (deferred)");
net_eth_carrier_off(host_data->eth_iface);
break;

default:
break;
}
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this in a work queue?
net_eth_carrier_on and net_eth_carrier_off should be called directly

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here’s the issue: if a network device supports the ETHERNET_HW_FILTERING feature, and the USB host’s interrupt IN callback in the USB subsystem receives a link-switching notification from the USB device, then if net_eth_carrier_on() is called directly, the network subsystem will directly invoke the code in the ETHERNET_CONFIG_TYPE_FILTER section of .set_config and this code, in turn, calls the USB host’s control transfer function usbh_req_setup(). Thus, the overall call logic is equivalent to triggering a USB control transfer within the USB transfer callback. However, since the usbh_req_setup() is a synchronous API, the thread containing the USB callback will remain blocked until the synchronous usbh_req_setup() API completes, and since the callback triggered by usbh_req_setup is also in the current coroutine, this causes the usbh_req_setup() triggered within the transfer callback to always return a timeout.

@josuah Is it possible to improve the implementation of usbh_req_setup() to support calling it from within a USB transfer callback? If not, is there a better solution here?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested it?

because that is the implementation of net_eth_carrier_on and net_eth_carrier_off:

void net_eth_carrier_on(struct net_if *iface)
{
struct ethernet_context *ctx = net_if_l2_data(iface);
if (!atomic_test_and_set_bit(&ctx->flags, ETH_CARRIER_UP)) {
k_work_submit(&ctx->carrier_work);
}
}
void net_eth_carrier_off(struct net_if *iface)
{
struct ethernet_context *ctx = net_if_l2_data(iface);
if (atomic_test_and_clear_bit(&ctx->flags, ETH_CARRIER_UP)) {
k_work_submit(&ctx->carrier_work);
}
}

the network subsystem itself uses a workqueue for the real action

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a difference between net_eth_carrier_on and net_if_carrier_on

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked it locally, directly call can work without issue. Now the work queue is removed. Thanks for the reminder.

Comment thread subsys/usb/host/class/usbh_cdc_ecm.c
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment on lines +1361 to +1373
if (is_broadcast) {
atomic_add(&host_data->stats[USB_CDC_ECM_STAT_BROADCAST_BYTES_RCV], buf_len);
atomic_inc(&host_data->stats[USB_CDC_ECM_STAT_BROADCAST_FRAMES_RCV]);
} else if (is_multicast) {
atomic_add(&host_data->stats[USB_CDC_ECM_STAT_MULTICAST_BYTES_RCV], buf_len);
atomic_inc(&host_data->stats[USB_CDC_ECM_STAT_MULTICAST_FRAMES_RCV]);
} else {
atomic_add(&host_data->stats[USB_CDC_ECM_STAT_DIRECTED_BYTES_RCV], buf_len);
atomic_inc(&host_data->stats[USB_CDC_ECM_STAT_DIRECTED_FRAMES_RCV]);
}
#endif

return 0;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this needed we already collect these stats in

static void ethernet_update_rx_stats(struct net_if *iface, size_t length,
bool dst_broadcast, bool dst_eth_multicast)
{
if (!IS_ENABLED(CONFIG_NET_STATISTICS_ETHERNET)) {
return;
}
eth_stats_update_bytes_rx(iface, length);
eth_stats_update_pkts_rx(iface);
if (dst_broadcast) {
eth_stats_update_broadcast_rx(iface);
} else if (dst_eth_multicast) {
eth_stats_update_multicast_rx(iface);
}
}
(via net_recv_data)

Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
}

#if defined(CONFIG_NET_STATISTICS_ETHERNET)
static struct net_stats_eth *eth_get_stats(const struct device *dev, struct net_if *iface)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the returned struct net_stats_eth pointer is not only for reading, but also for saving. take a look at subsys/net/l2/ethernet/eth_stats.h

your export_statistics works only for reading.

you need to provide a mutable struct net_stats_eth here.

also don't do double work, please check wich fields are already filled by the networking subsystem.

Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment on lines +1535 to +1553
#if defined(CONFIG_NET_STATISTICS_ETHERNET)
if (buf_len > 0) {
atomic_inc(&host_data->stats[USB_CDC_ECM_STAT_XMIT_OK]);

if (is_broadcast) {
atomic_add(&host_data->stats[USB_CDC_ECM_STAT_BROADCAST_BYTES_XMIT],
buf_len);
atomic_inc(&host_data->stats[USB_CDC_ECM_STAT_BROADCAST_FRAMES_XMIT]);
} else if (is_multicast) {
atomic_add(&host_data->stats[USB_CDC_ECM_STAT_MULTICAST_BYTES_XMIT],
buf_len);
atomic_inc(&host_data->stats[USB_CDC_ECM_STAT_MULTICAST_FRAMES_XMIT]);
} else {
atomic_add(&host_data->stats[USB_CDC_ECM_STAT_DIRECTED_BYTES_XMIT],
buf_len);
atomic_inc(&host_data->stats[USB_CDC_ECM_STAT_DIRECTED_FRAMES_XMIT]);
}
}
#endif
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
@maass-hamburg
Copy link
Copy Markdown
Member

Just to mention, not every optional feature needs to be in the initial PR, for example statistics are not needed for networking to work. it is a nice to have and can be added in a later PR. If you really want it in here, put it in its own commit.

@D-Veda D-Veda force-pushed the usbh_cdc_ecm_support branch from 5822aa9 to 9aac218 Compare May 4, 2026 19:06
@D-Veda
Copy link
Copy Markdown
Author

D-Veda commented May 4, 2026

Just to mention, not every optional feature needs to be in the initial PR, for example statistics are not needed for networking to work. it is a nice to have and can be added in a later PR. If you really want it in here, put it in its own commit.

The statistics part feature is moved out now.

@D-Veda D-Veda requested a review from maass-hamburg May 4, 2026 19:13
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
Comment thread subsys/usb/host/class/usbh_cdc_ecm.c Outdated
@D-Veda D-Veda force-pushed the usbh_cdc_ecm_support branch from 9aac218 to e5acb05 Compare May 4, 2026 19:56
@D-Veda
Copy link
Copy Markdown
Author

D-Veda commented May 6, 2026

Sorry for this very incomplete review as I am short on time today.

#99097 (comment) does not seem to be interested in modifying his PR to make it closer to this one. Would you be interested in trying to rebase your PR on top of his content?

If finding modifications needed to #99097, they could then be submitted as review comments to his PR.

@josuah I'm sorry for replying to you so late.

Compared to #99097, this PR implements more complete functionality required for CDC-ECM:

  • Packet / Multicast Filter
  • Complete and accurate descriptor resolution
  • Precise MAC string lookup
  • Complete instance lifecycle management
  • Complete link status notification
  • statistics is removed from this PR

Originally, this PR was based on #99097. However, during the design and testing process, it was found that in order to fully support the above functions, I needed to make a lot of modifications to #99097 to ensure that the basic functions of CDC-ECM can have a complete instance lifecycle and that no abnormal situations will occur (including ensuring consistency of data flow between network subsystem <-> usb host subsystem, and avoiding resource management or data contention issues such as Use-After-Free). And from reviewing perspective, I need to combine the commits together and add Santhosh Charles as co-author. So the result is that #99097 's commits is merged with my changes and Santhosh Charles is added as co-author.

If #99097 is adopted, I would need to perform extensive refactoring on top of it. From a practical standpoint, I believe directly adopting this PR is the most appropriate approach.

This PR also has advantages over #99097 in basic data stream processing:

  • Use zero-copy to directly forward the USB buffer to the network subsystem in the bulk in direction.
  • Linearization of the buffer was uniformly applied in the bulk out direction.
  • All incoming traffic uses pre-allocated xfer and buffer, and reuses data as much as possible in the callback (the entire xfer / buf process has achieved the lowest possible latency in data forwarding).
  • Use k_sem to manage the maximum number of concurrent send requests in the bulk out direction.
  • Use a separate buffer pool to manage forwarded data

Based on the above, I believe we should directly adopt this PR. Adding changes similar to this PR to #99097 would only increase the effort required, which is practically a waste of time and energy.

D-Veda and others added 3 commits May 7, 2026 11:44
Updates:
- Add Ethernet Statistics Feature Selector marcos
- Add CONNECTION_SPEED_CHANGE notification macro
- Add Ethernet Power Management Pattern activation macros

Signed-off-by: Dv Alan <zhangyang.shen@nxp.com>
Add 4 functions to handle USB string descriptor operations:
- usbh_req_desc_str() to retrieve USB string descriptors from device
- usbh_desc_is_valid_string() to validate string descriptor type
- usbh_desc_get_supported_langs() to get supported languages list
- usbh_desc_str_utfle16_to_ascii() to convert UTF-16LE string to ASCII

Co-authored-by: Johann Fischer <johann.fischer@nordicsemi.no>
Signed-off-by: Dv Alan <zhangyang.shen@nxp.com>
Implements USB CDC-ECM standard protocol and integrates USB-to-Ethernet
devices as network interfaces into Zephyr network stack

Co-authored-by: Santhosh Charles <santhosh@linumiz.com>
Signed-off-by: Dv Alan <zhangyang.shen@nxp.com>
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 7, 2026

@zups
Copy link
Copy Markdown

zups commented May 8, 2026

Hello,

Thanks for this PR, been using it successfully with some device.

However, there's interesting bug sometimes happening (been going on for a while but with newest commits there was also some error message with it) that sometimes the usbh-cdc-ecm driver is in some kind of a lock state but it gets opened up once i run "net ping -I 2 <ip_addr>" where interface 2 is the usbh-cdc-ecm iface.

[00:04:38.519,201] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
uart:~$ net ping -I 2 25.25.25.25
PING 25.25.25.25
28 bytes from 25.25.25.25 to 192.168.50.2: icmp_seq=1 ttl=52 time=173 ms
28 bytes from 25.25.25.25 to 192.168.50.2: icmp_seq=2 ttl=52 time=175 ms
28 bytes from 25.25.25.25 to 192.168.50.2: icmp_seq=3 ttl=52 time=171 ms

After this the pipe opens up and data starts to once again flow from it.

Here's some code that claude produces to "fix" it; i could not test this myself properly since I can rarely reproduce this bug.

static int bulk_in_req_cb(struct usb_device *const udev, struct uhc_transfer *const xfer)
{
	struct cdc_ecm_host_data *host_data = xfer->priv;
	struct net_buf *buf = xfer->buf;
	struct net_buf *new_buf = NULL;
	uint16_t max_segment_size;
	int err;

	if (xfer->err == -ECONNRESET) {
		LOG_INF("The bulk IN transfer is cancelled");
		net_buf_unref(buf);
		usbh_xfer_free(udev, xfer);
		return 0;
	}

	if (xfer->err != 0) {
		LOG_WRN("The bulk IN transfer failed: %d", xfer->err);
	}

	if (!atomic_test_bit(&host_data->flags, CDC_ECM_DEVICE_FLAG_CONNECTED) ||
	    !atomic_test_bit(&host_data->flags, CDC_ECM_DEVICE_FLAG_FORWARDING)) {
		return 0;
	}

	max_segment_size = CDC_ECM_DESC_MAX_SEGMENT_SIZE(&host_data->desc);

	/* Pre-allocate replacement buffer BEFORE handing the current one to the stack.
	 * If allocation fails, we drop this packet and reuse the existing buffer,
	 * guaranteeing the RX slot stays alive.
	 */
	if (xfer->err == 0 && buf->len > 0 && buf->len <= max_segment_size) {
		new_buf = net_buf_alloc(&usbh_cdc_ecm_pool, K_NO_WAIT);
	}

	if (new_buf != NULL) {
		/* Hand off old buf to network stack, install new_buf */
		err = forward_received_pkt(host_data, buf);
		if (err != 0) {
			/* Stack rejected it; buf is back in our hands via the
			 * cleanup inside forward_received_pkt. Drop new_buf and
			 * reuse the original.
			 */
			net_buf_unref(new_buf);
			net_buf_reset(buf);
		} else {
			xfer->buf = NULL;
			err = usbh_xfer_buf_add(udev, xfer, new_buf);
			if (err != 0) {
				LOG_ERR("Failed to add buffer to bulk IN transfer: %d", err);
				net_buf_unref(new_buf);
				return 0;
			}
		}
	} else {
		/* Either the transfer errored, the length was bad, or the pool
		 * was exhausted. Drop the packet and reuse the buffer.
		 */
		net_buf_reset(buf);
	}

	err = usbh_xfer_enqueue(udev, xfer);
	if (err != 0) {
		LOG_ERR("Failed to continue bulk IN transfer: %d", err);
	}

	return 0;
}

Some explanation of the fix and what is the potential bug:

Issue: The bulk IN callback orphans transfers when net_buf_alloc fails after handing the current buffer to the stack — xfer->buf is set to NULL, then on alloc failure we goto done without re-enqueuing. Each occurrence permanently kills one RX slot; after RX_PIPELINE_DEPTH hits, the RX pipe is dead until replug. This explains the intermittent "stuck pipe" reports where outbound traffic (e.g. net ping) appears to "wake up" the interface — it doesn't actually recover the dead slots, it just coincides with pool pressure easing.
Fix: Pre-allocate the replacement buffer before handing the current one off. If allocation fails (or the frame is invalid — too short, too long, or transfer errored), drop the packet, reset and reuse the existing buffer, and always re-enqueue. The RX slot can no longer be lost; worst case under memory pressure is a dropped packet, which the network stack handles fine.
Also added a length floor (>= sizeof(struct net_eth_hdr)) to avoid forwarding runt frames to the stack.

Also:

Side note (not blocking this PR): even with the fix, I still see periodic bulk_in_req_cb: The bulk IN transfer failed: -5 warnings at ~5s intervals during idle. The pipe stays alive and throughput is fine, so these appear to be benign — likely the device emitting some periodic frame the HC reports as -EIO

[00:04:03.885,157] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:04:08.885,157] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:04:13.885,143] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:04:21.785,152] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:04:26.785,145] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:04:31.785,151] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:04:43.085,152] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:04:51.285,149] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:04:56.285,143] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:05:01.285,146] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:05:25.785,151] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:05:34.785,150] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:05:39.785,148] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:05:44.785,142] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:05:53.885,150] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:05:58.885,141] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:06:03.885,146] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:06:08.885,146] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:06:28.785,151] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:07:29.985,144] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:07:34.985,146] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:07:39.985,141] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:07:44.985,146] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:07:49.985,146] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:07:58.685,139] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:08:03.685,146] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:08:08.685,151] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:08:13.685,148] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:08:18.685,142] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:08:23.685,139] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5
[00:08:28.685,147] <wrn> usbh_cdc_ecm: bulk_in_req_cb: The bulk IN transfer failed: -5

Thanks.

@zups
Copy link
Copy Markdown

zups commented May 8, 2026

It seems like this code did not however manage to fix that, and I am frankly not sure is it even related to this driver or Zephyr RTOS network stack. :) But look into it and if doesn't make any sense or you don't think it is in this driver, then ignore all the previous comments since I am quite unsure what is the problem. :) All i know is that running "net ping -I 2 <ip_addr>" fixes it when sometimes the usbh-cdc-ecm or something is stuck. :) Cheers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Architecture Review Discussion in the Architecture WG required area: Boards/SoCs area: Devicetree Bindings area: Devicetree area: Networking area: Samples Samples area: Tests Issues related to a particular existing or missing test area: USB Universal Serial Bus platform: NXP NXP

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

9 participants